Analysing E-mail Text Authorship for Forensic Purposes
Abstract
E-mail has become the most popular Internet application, and its rise in use has brought an inevitable increase in the use of e-mail for criminal purposes. An e-mail message can be sent anonymously or through spoofed servers, so computer forensics analysts need a tool that can identify the author of such messages. This thesis describes the development of such a tool using techniques from the fields of stylometry and machine learning. An author’s style can be reduced to a pattern by measuring various stylometric features of the text. E-mail messages also contain macro-structural features that can be measured. Together, these features can be used with the Support Vector Machine learning algorithm to classify, or attribute authorship of, e-mail messages to an author, provided that a suitable sample of that author’s messages is available for comparison. In an investigation, the set of authors may need to be reduced from an initial large list of possible suspects. This research has trialled authorship characterisation based on sociolinguistic cohorts, such as gender and language background, as a technique for profiling the anonymous message so that the suspect list can be reduced.
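As a rough illustration of what "measuring stylometric and macro-structural features" could look like, the sketch below turns one message body into a small numeric feature dictionary. It is not the feature set used in the thesis, which the abstract does not list: the function name `stylometric_features`, the `FUNCTION_WORDS` list, and the greeting and sign-off patterns are all assumptions chosen for the example. A vector like this is the kind of input a Support Vector Machine classifier would be trained on.

```python
import re
from collections import Counter

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "is", "it")

# Hypothetical greeting and sign-off patterns (macro-structural cues).
GREETING = re.compile(r"\s*(hi|hello|dear)\b", re.IGNORECASE)
SIGNOFF = re.compile(r"\s*(regards|cheers|thanks)\b", re.IGNORECASE)


def stylometric_features(message: str) -> dict:
    """Measure a few style and structure features of one e-mail body."""
    lines = message.splitlines()
    non_blank = [ln for ln in lines if ln.strip()]
    words = re.findall(r"[A-Za-z']+", message.lower())
    counts = Counter(words)
    n_words = len(words) or 1  # avoid division by zero on empty text

    features = {
        # Style markers measured from the text itself.
        "n_words": float(len(words)),
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "vocabulary_richness": len(counts) / n_words,  # type/token ratio
        # Macro-structural features of the message layout.
        "blank_line_ratio": (len(lines) - len(non_blank)) / (len(lines) or 1),
        "has_greeting": float(bool(non_blank) and bool(GREETING.match(non_blank[0]))),
        "has_signoff": float(any(SIGNOFF.match(ln) for ln in non_blank[-2:])),
    }
    # Relative frequency of each function word (largely topic-independent).
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = counts[fw] / n_words
    return features


if __name__ == "__main__":
    sample = "Hello Bob,\n\nThe report is in the mail.\n\nRegards,\nAlice"
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Function-word frequencies and layout cues are used here because they depend little on the topic of the message, which is the usual motivation for such features in authorship work; a real system would need a much larger feature set and many messages per author.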
Similar resources
Language and Gender Author Cohort Analysis of E-mail for Computer Forensics
We describe an investigation of authorship gender and language background cohort attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail document features such as style markers, structural characteristics and gender-preferential language features together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-m...
Technology Corner: Analysing E-mail Headers For Forensic Investigation
Electronic Mail (E-Mail), which is one of the most widely used applications of Internet, has become a global communication infrastructure service. However, security loopholes in it enable cybercriminals to misuse it by forging its headers or by sending it anonymously for illegitimate purposes, leading to e-mail forgeries. E-mail messages include transit handling envelope and trace information i...
Best Practices and Admissibility of Forensic Author Identification
Forensic linguistics provides answers to four categories of inquiry in investigative and legal settings: (i) identification of author, language, or speaker; (ii) intertextuality, or the relationship between texts; (iii) text-typing or classification of text types such as threats, suicide notes, or predatory chat; and (iv) linguistic profiling to assess the author’s dialect, native language, age...
A Novel Approach of Mining Write-Prints for Authorship Attribution in E-mail Forensics
There is an alarming increase in the number of cybercrime incidents through anonymous e-mails. The problem of e-mail authorship attribution is to identify the most plausible author of an anonymous e-mail from a group of potential suspects. Most previous contributions employed a traditional classification approach, such as decision tree and Support Vector Machine (SVM), to identify the author an...
A Profile-Based Authorship Attribution Approach to Forensic Identification in Chinese Online Messages
With the popularity of Internet technologies and applications, inappropriate or illegal online messages have become a problem for the society. The goal of authorship attribution for anonymous online messages is to identify the authorship from a group of potential suspects for investigation identification. Most previous contributions focused on extracting various writing-style features and emplo...
Publication date: 2003